Abstract: We introduce a new family of erasure codes, called
group decodable codes (GDC), for distributed storage systems.
Given a set of design parameters $\{\alpha, \beta, k, t\}$, where $k$ is the
number of information symbols, each codeword of an $(\alpha, \beta, k, t)$-
group decodable code is a $t$-tuple of strings, called buckets, such
that each bucket is a string of $\beta$ symbols that is a codeword of a
$[\beta, \alpha]$ MDS code (which is encoded from $\alpha$ information symbols).
Such codes have the following two properties:

(P1) Locally repairable: Each code symbol has locality $(\alpha, \beta - \alpha + 1)$.

(P2) Group decodable: From each bucket we can decode $\alpha$
information symbols.

We establish an upper bound on the minimum distance of an
$(\alpha, \beta, k, t)$-group decodable code for any given set of $\{\alpha, \beta, k, t\}$.
We also prove that the bound is achievable when the coding field
$\mathbb{F}$ has size $|\mathbb{F}| > \binom{n-1}{k-1}$.

I. Introduction

Distributed storage systems (DSS) are becoming increasingly
important due to the explosively growing demand for large-
scale data storage, including large files and video sharing,
social networks, and back-up systems. Distributed storage
systems store a tremendous amount of data using a massive
collection of distributed storage nodes and, to ensure reliability
against node failures, introduce a certain amount of redundancy.

The simplest form of redundancy is replication. DSS with
replication are very easy to implement, but extremely
inefficient in storage, incurring tremendous waste in
devices and equipment. In recent years, some efficient schemes
for distributed storage systems, such as erasure codes [1] and
regenerating codes [2], have been proposed. We focus on erasure
codes in this paper.

MDS codes are the most efficient erasure codes in terms of
storage efficiency. When an $[n, k]$ MDS code is used, the data file
that needs to be stored is divided into $k$ information packets,
where each packet is a symbol of the coding field. These $k$
information packets are encoded into $n$ packets and stored in
$n$ storage nodes such that each node stores one packet. Then
the original file can be recovered from any $k$ out of the $n$
coded packets. Although an MDS code is storage optimal, it is
not efficient for node repair. That is, when one storage node
fails, we must download the whole file from some other $k$
nodes to reconstruct the coded packet stored in it.

To construct erasure codes with more repair efficiency than
MDS codes, the concepts of locality and locally repairable
code (LRC) were introduced [3], [4], [5]. Let $1 \le \alpha \le k$
and $\delta \ge 2$. The $i$th code symbol $c_i$ ($1 \le i \le n$) in an $[n,k]$
linear code $C$ is said to have locality $(\alpha, \delta)$ if there exists
a subset $S_i \subseteq [n] = \{1, 2, \cdots, n\}$ containing $i$ and of size
$|S_i| \le \alpha + \delta - 1$ such that the punctured subcode of $C$ to
$S_i$ has minimum distance at least $\delta$. We will call each subset
$\{c_j ; j \in S_i\}$ a repair group. Thus, if $c_i$ has locality $(\alpha, \delta)$, then
$c_i$ can be computed from any $|S_i| - \delta + 1$ other symbols in the
repair group $\{c_j ; j \in S_i\}$. A code is said to have all-symbol
locality $(\alpha, \delta)$ (or is called an $(\alpha,\delta)_a$ code) if all of its code
symbols have locality $(\alpha, \delta)$. Note that $|S_i| - \delta + 1 \le \alpha$. The
code has a higher repair efficiency than an MDS code if $\alpha < k$.
The minimum distance of an $(\alpha, \delta)_a$ linear code is bounded
by (see [?]):

\[ d \le n - k + 1 - \left( \left\lceil \frac{k}{\alpha} \right\rceil - 1 \right)(\delta - 1). \tag{1} \]

However, for the case that $(\alpha + \delta - 1) \nmid n$ and $\alpha \mid k$, there exists
no $(\alpha, \delta)_a$ linear code achieving the above bound [6].

The most common case of an $(\alpha,\delta)_a$ linear code is that $n$
is divisible by $\alpha + \delta - 1$. For this case, in the constructions
presented in the literature, all code symbols of an $(\alpha, \delta)_a$ linear
code are usually divided into $t = \frac{n}{\alpha+\delta-1}$ mutually disjoint
repair groups such that each repair group is a codeword of an
$[\alpha + \delta - 1, \alpha]$ MDS code. Fig. 1 illustrates a $(4,3)_a$ systematic
linear code with $n = 18$ and $k = 6$, where $x_1, \cdots, x_6$ are
the information symbols and $y_1, \cdots, y_{12}$ are the parities. All
code symbols are divided into three groups and each group is
a codeword of a $[6,4]$ MDS code. By constructing the parities
elaborately, the code can be distance optimal according to (1).

Fig. 1. Illustration of a systematic locally repairable code: The information
symbols $x_1, \cdots, x_6$ are encoded into $x_1, \cdots, x_6, y_1, \cdots, y_{12}$ that are
divided into three groups. Each group is a codeword of a $[6,4]$ MDS code.

As pointed out in [?], in distributed storage applications
there are subsets of the data that are accessed more often than
the remaining contents (they are termed “hot data”). Thus, a
desired property of a distributed storage system is that the
subsets of hot data can be retrieved easily and in multiple
ways. For example, for the storage system illustrated by Fig. 1,
suppose $x_1$ is hot data. There are two “easy ways” to retrieve
it from the system: download $x_1$ directly from the node
where it is stored, or decode it from any four coded symbols
in the first group. Another way is to decode it from some six
coded symbols, but this is not an easy way because to decode
$x_1$, one has to decode the whole data file.


Bucket 1:  $x_1$  $x_2$  $x_3$  $x_4$  $z_1$  $z_2$
Bucket 2:  $x_1$  $x_2$  $x_5$  $x_6$  $z_3$  $z_4$
Bucket 3:  $x_3$  $x_4$  $x_5$  $x_6$  $z_5$  $z_6$

Fig. 2. Illustration of a $(4, 6, 6, 3)$-group decodable code: $x_1, \cdots, x_6$
are information symbols and $z_1, \cdots, z_6$ are parities. Each codeword has 3
buckets and each bucket is a codeword of a $[6, 4]$ MDS code that is encoded
from 4 information symbols. Clearly, each bucket is a repair group.

In this work, we introduce a new family of erasure codes,
called group decodable codes (GDC), for distributed storage
systems, which can provide more options of easy ways to
retrieve each information symbol than systematic codes. Given
a set of design parameters $\{\alpha, \beta, k, t\}$, where $k$ is the number
of information symbols, each codeword of an $(\alpha,\beta,k,t)$-
group decodable code is a $t$-tuple of strings, called buckets,
such that each bucket is a string of $\beta$ symbols that is a
codeword of a $[\beta, \alpha]$ MDS code (which is encoded from $\alpha$
information symbols). So such codes have the following two
properties:

(P1) Locally repairable: Each code symbol has locality
$(\alpha, \beta - \alpha + 1)$.

(P2) Group decodable: From each bucket we can decode $\alpha$
information symbols.

Fig. 2 illustrates a $(4, 6, 6, 3)$-group decodable code. There
are six information symbols $x_1, \cdots, x_6$. Each codeword has 3
buckets and each bucket is a codeword of a $[6,4]$ MDS code
that is encoded from 4 information symbols. Clearly, each
bucket is a repair group. So each code symbol of this code
has locality $(4,3)$. Moreover, this code provides more options
of easy ways to retrieve each information symbol than the
code in Fig. 1. For example, $x_1$ can be downloaded directly
from two nodes or can be decoded from any four symbols in
bucket 1 or any four symbols in bucket 2. In the case that $x_1$ is
requested simultaneously by many users of the system, the code can
ensure that multiple read requests can be satisfied concurrently
and with no delays.

A. Our contribution 

We establish an upper bound on the minimum distance
of group decodable codes for any given set of parameters
$\{\alpha, \beta, k, t\}$ (Theorem 4). We also prove that there exist linear
codes whose minimum distances achieve the bound,
which proves the tightness of the bound (Theorem 5). Our
proof gives a method to construct an $(\alpha, \beta, k, t)$-group decodable
code on a field of size $q > \binom{n-1}{k-1}$, where $n = t\beta$ is the length
of the code.

B. Related Work 

Some existing works consider erasure codes for distributed 
storage that can provide multiple alternatives for repairing 
information symbols or all code symbols with locality. 

In [?], the authors introduced the metric “local repair
tolerance” to measure the maximum number of erasures that
do not compromise local repair. They also presented a class
of locally repairable codes, named pg-BLRC codes, with high
local repair tolerance and low repair locality. However, they
did not present any bound on the minimum distance of such
codes.

In [?], the concept of $(\alpha, \delta)_c$-locality was defined, which
captures the property that there exist $\delta - 1$ pairwise disjoint
local repair sets for a code symbol. An upper bound on the
minimum distance for $[n, k]$ linear codes with information
$(\alpha, \delta)_c$-locality was derived, and codes that attain this bound
were constructed for length $n \ge k(\alpha(\delta - 1) + 1)$. However,
for $n < k(\alpha(\delta - 1) + 1)$, it is not known whether there exist
codes attaining this bound. Upper bounds on the rate and
minimum distance of codes with all-symbol $(\alpha, \delta)_c$-locality
were proved in [?]. However, no explicit construction of codes
that achieve this bound was presented. It is still an open
question whether the distance bound in [?] is achievable.

Another subclass of LRC is codes with $(r, t)$-locality: any
set of $t$ code symbols are functions of at most $r$ other code
symbols [?]. Hence, for such codes, any $t$ failed code symbols
can be repaired by contacting at most $r$ other code symbols. An
upper bound on the minimum distance of such codes similar
to (1) is derived in [11].

C. Organization 

The paper is organized as follows. In Section II, we present
the related concepts and the main results of this paper. We
prove the main results in Section III and Section IV.

II. Model and Main Result 

Denote $[n] := \{1, \cdots, n\}$ for any given positive integer $n$.
Let $\mathbb{F}$ be a finite field and $k$ be a positive integer. For any
$S = \{i_1, \cdots, i_\alpha\} \subseteq [k]$, the projection of $\mathbb{F}^k$ about $S$ is a
function $\psi_S : \mathbb{F}^k \to \mathbb{F}^\alpha$ such that for any $(x_1, \cdots, x_k) \in \mathbb{F}^k$,

\[ \psi_S(x_1, \cdots, x_k) = (x_{i_1}, \cdots, x_{i_\alpha}). \tag{2} \]

We can define group decodable codes (GDC) as follows.

Definition 1: Suppose $\mathcal{S} = \{S_1, \cdots, S_t\}$ is a collection of
subsets of $[k]$ and $\mathcal{N} = \{n_1, \cdots, n_t\}$ is a collection of positive
integers such that $\bigcup_{i=1}^{t} S_i = [k]$ and $n_i \ge k_i = |S_i|$, $\forall i \in [t]$.
A linear code $C$ is said to be an $(\mathcal{N}, \mathcal{S})$-group decodable code
(GDC) if $C$ has an encoding function $f$ of the following form:

\[ f : \mathbb{F}^k \to \mathbb{F}^{n_1} \times \cdots \times \mathbb{F}^{n_t}, \qquad
   x \mapsto \bigl(f_1(\psi_{S_1}(x)), \cdots, f_t(\psi_{S_t}(x))\bigr), \tag{3} \]

where each $f_i : \mathbb{F}^{k_i} \to \mathbb{F}^{n_i}$ is an encoding function of an
$[n_i, k_i]$ MDS code and the output of it is called a bucket.

By Definition 1, if $C$ is an $(\mathcal{N}, \mathcal{S})$-group decodable code,
then $C$ has length $n = \sum_{i=1}^{t} n_i$. For any message vector $x =
(x_1, \cdots, x_k)$ and $i \in [t]$, the subset of $k_i$ messages $\{x_j ; j \in
S_i\}$ is encoded into a bucket of $n_i$ symbols by the function
$f_i$. A codeword of $C$ is the concatenation of these $t$ buckets.
Since $f_i$ is an encoding function of an $[n_i, k_i]$ MDS code,
each bucket is a repair group and we can decode the subset
$\{x_j ; j \in S_i\}$ from any $k_i$ symbols of the $i$th bucket. The
term “group decodable code” comes from this observation.
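As a concrete illustration of Definition 1, the following is a minimal sketch of the bucket structure of Fig. 2, assuming (these choices are ours, not the paper's) the prime field GF(7) and a systematic Reed-Solomon code as the $[6,4]$ MDS code in every bucket. The names `lagrange_eval`, `mds_encode`, `mds_decode` and `gdc_encode` are hypothetical helpers introduced only for this example.

```python
from itertools import combinations

P = 7  # assumed prime field size; must be >= bucket length so points 0..n-1 are distinct

def lagrange_eval(xs, ys, x, p=P):
    """Value at x of the unique polynomial of degree < len(xs) through (xs, ys) over GF(p)."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

def mds_encode(msg, n, p=P):
    """Systematic [n, len(msg)] Reed-Solomon encoder: symbol j is the polynomial value at j."""
    return [lagrange_eval(range(len(msg)), msg, x, p) for x in range(n)]

def mds_decode(positions, symbols, k, p=P):
    """Recover the k message symbols from any k (position, symbol) pairs of one bucket."""
    return [lagrange_eval(positions, symbols, x, p) for x in range(k)]

def gdc_encode(x, subsets, n_i, p=P):
    """Encoding map of the form (3): bucket i encodes the projection of x onto subsets[i]."""
    return [mds_encode([x[j] for j in S], n_i, p) for S in subsets]

# Index sets of Fig. 2, written 0-based: S_1 = {x1..x4}, S_2 = {x1,x2,x5,x6}, S_3 = {x3..x6}.
FIG2_SUBSETS = [[0, 1, 2, 3], [0, 1, 4, 5], [2, 3, 4, 5]]

if __name__ == "__main__":
    x = [3, 1, 4, 1, 5, 2]
    buckets = gdc_encode(x, FIG2_SUBSETS, 6)
    # Group decodability: any 4 symbols of a bucket give back its 4 information symbols.
    for S, b in zip(FIG2_SUBSETS, buckets):
        for pos in combinations(range(6), 4):
            assert mds_decode(list(pos), [b[q] for q in pos], 4) == [x[j] for j in S]
    print(buckets)
```

Because every bucket is a codeword of an MDS code, the same interpolation routine also repairs a single lost symbol from any four surviving symbols of its bucket, which is the local repair property.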

For the special case that $S_1, \cdots, S_t$ are pairwise disjoint,
an $(\mathcal{N}, \mathcal{S})$-group decodable code $C$ is just the direct sum of
the $t$ buckets and the minimum distance of $C$ is $\min\{n_i - k_i +
1 ; i \in [t]\}$. In this work, we consider the most general case
that $S_1, \cdots, S_t$ can have arbitrary intersection.

Definition 1 depends on the explicit collections $\mathcal{S}$ and $\mathcal{N}$.
We can also define GDC based on design parameters.

Definition 2: Let $\alpha, \beta, k, t$ be positive integers such that
$\alpha \le \min\{k, \beta\}$. A linear code $C$ is said to be an $(\alpha,\beta,k,t)$-
group decodable code if $C$ is an $(\mathcal{N}, \mathcal{S})$-group decodable code
for some $\mathcal{S} = \{S_1, \cdots, S_t\}$ and $\mathcal{N} = \{n_1, \cdots, n_t\}$ such that
$S_i \subseteq [k]$, $|S_i| = \alpha$ and $n_i = \beta$ for all $i \in [t]$.

If $C$ is an $(\alpha, \beta, k, t)$-group decodable code, then by
Definition 2 the length of $C$ is $n = t\beta$. Moreover, since
$\bigcup_{i=1}^{t} S_i = [k]$ and $|S_i| = \alpha$, then $t\alpha = \sum_{i=1}^{t} |S_i| \ge k$, which
implies that $\lfloor t\alpha / k \rfloor \ge 1$. So we have the following remark.

Remark 3: If $C$ is an $(\alpha, \beta, k, t)$-group decodable code, then
$n = t\beta$ and $\lfloor t\alpha / k \rfloor \ge 1$.

We will give a tight upper bound on the minimum distance
$d$ of an $(\alpha, \beta, k, t)$-group decodable code $C$. Our main results
are the following two theorems.

Theorem 4: Let $t\alpha = sk + r$ such that $s \ge 1$ and $0 \le r \le
k - 1$. If $C$ is an $(\alpha, \beta, k, t)$-group decodable code, then

\[ d \le s\beta - \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil + 1. \tag{4} \]
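The bound of Theorem 4 is easy to evaluate numerically. The sketch below is only an illustration: the function names `gdc_distance_bound` and `lrc_distance_bound` are hypothetical, and the second function assumes the locality bound (1) in the form $d \le n - k + 1 - (\lceil k/\alpha \rceil - 1)(\delta - 1)$.

```python
from math import comb

def gdc_distance_bound(alpha, beta, k, t):
    """Right-hand side of (4): s*beta - ceil((k - r) / C(t, s)) + 1, with t*alpha = s*k + r."""
    if not 1 <= alpha <= min(k, beta):
        raise ValueError("need 1 <= alpha <= min(k, beta)")
    if t * alpha < k:
        raise ValueError("need t*alpha >= k so that the buckets can cover all symbols")
    s, r = divmod(t * alpha, k)
    ceil_term = -(-(k - r) // comb(t, s))  # integer ceiling division
    return s * beta - ceil_term + 1

def lrc_distance_bound(n, k, alpha, delta):
    """Right-hand side of (1) for an (alpha, delta)_a code (assumed form, see lead-in)."""
    return n - k + 1 - (-(-k // alpha) - 1) * (delta - 1)

if __name__ == "__main__":
    # Parameters of Fig. 2: (alpha, beta, k, t) = (4, 6, 6, 3), so n = 18 and delta = 3.
    print(gdc_distance_bound(4, 6, 6, 3))   # 2*6 - ceil(6/3) + 1
    print(lrc_distance_bound(18, 6, 4, 3))
```

For the parameters of Fig. 2 we have $t\alpha = 12 = 2 \cdot 6 + 0$, so $s = 2$, $r = 0$ and the bound evaluates to $12 - \lceil 6/3 \rceil + 1 = 11$.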


Note that an $(\alpha, \beta, k, t)$-group decodable code is an $(\alpha,\delta)_a$
code, with $\delta = \beta - \alpha + 1$, that has the additional property (P2).
So the bound (4) is in general no larger than the bound (1). The
sacrifice in minimum distance results from the property (P2).

Theorem 5: If $|\mathbb{F}| > \binom{n-1}{k-1}$, then there exists an $(\alpha, \beta, k, t)$-
group decodable code over $\mathbb{F}$ whose minimum distance $d$ achieves
the bound (4).

By Remark 3, $t\alpha \ge k$. So we always have $t\alpha = sk + r$ for
some $s \ge 1$ and $0 \le r \le k - 1$. So Theorems 4 and 5 cover
all possible sets of parameters $\{\alpha, \beta, k, t\}$.


III. Proof of Theorem 4

In this section, we prove Theorem 4. We will use some
similar discussions as in [?], [?], [?].

In the rest of this paper, we always assume that $\mathcal{S} =
\{S_1, \cdots, S_t\}$ is a collection of subsets of $[k]$ and $\mathcal{N} =
\{n_1, \cdots, n_t\}$ such that $\bigcup_{i=1}^{t} S_i = [k]$ and $n_i = \beta \ge |S_i| = \alpha$
for all $i \in [t]$. Moreover, let $n = t\beta$ and

\[ J_i = \{(i-1)\beta + 1, (i-1)\beta + 2, \cdots, i\beta\}. \tag{5} \]

Clearly, $J_1, \cdots, J_t$ are pairwise disjoint and $\bigcup_{i=1}^{t} J_i = [n]$.
Let $\ell$ be any positive integer and $A$ be any $k \times \ell$ matrix. If
$J \subseteq [\ell]$, we use $A_J$ to denote the sub-matrix of $A$ formed by
the columns of $A$ that are indexed by $J$. Moreover, we will
use the following notations:

1) For $i \in [k]$ and $j \in [\ell]$, $R_A(i)$ and $C_A(j)$ are the support
of the $i$th row and the $j$th column of $A$ respectively.
Meanwhile, $|R_A(i)|$ and $|C_A(j)|$ are called the weight
of the $i$th row and the $j$th column of $A$ respectively.

2) The minimum row weight of $A$ is

\[ w_{\min}(A) = \min_{i \in [k]} |R_A(i)|. \tag{6} \]

The $i$th row is said to be minimal if $|R_A(i)| = w_{\min}(A)$.

3) The repetition number of the $i$th row, denoted by $\Gamma_A(i)$,
is the number of $i' \in [k]$ such that $R_A(i') = R_A(i)$. Let
$I_A$ be the set of indices of all minimal rows of $A$. We
denote

\[ \Gamma(A) = \max_{i \in I_A} \Gamma_A(i). \tag{7} \]

Clearly, we always have $\Gamma(A) \ge 1$. The following example
gives some explanation of the above notations.

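Example 6: Consider the following $7 \times 8$ binary matrix

\[ A = \begin{pmatrix}
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 1 & 0 & 0 & 1
\end{pmatrix}. \]

We have $R_A(1) = R_A(6)$. So $\Gamma_A(1) = \Gamma_A(6) = 2$. Similarly,
$\Gamma_A(2) = \Gamma_A(5) = 2$ and the repetition numbers of all other
rows are 1. Note that $w_{\min}(A) = 3$ and the minimal rows of
$A$ are indexed by $\{1, 2, 5, 6\}$. Then $\Gamma(A) = 2$.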
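The quantities of Example 6 can be recomputed mechanically. The following is a small sketch with hypothetical helper names (`row_supports`, `w_min`, `minimal_rows`, `gamma`) that follows the definitions (6) and (7) directly.

```python
from collections import Counter

# The 7 x 8 binary matrix A of Example 6.
A = [
    [1, 0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 1, 0, 0, 1],
]

def row_supports(M):
    """R_M(i) for every row i, as sets of 1-based column indices."""
    return [frozenset(j for j, v in enumerate(row, start=1) if v) for row in M]

def w_min(M):
    """Minimum row weight, equation (6)."""
    return min(len(s) for s in row_supports(M))

def minimal_rows(M):
    """1-based indices of the rows whose weight equals w_min(M)."""
    w = w_min(M)
    return [i for i, s in enumerate(row_supports(M), start=1) if len(s) == w]

def gamma(M):
    """Largest repetition number among the minimal rows, equation (7)."""
    supports = row_supports(M)
    counts = Counter(supports)
    return max(counts[supports[i - 1]] for i in minimal_rows(M))

if __name__ == "__main__":
    print(w_min(A), minimal_rows(A), gamma(A))
```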

To prove Theorem 4, we first give a description of $(\mathcal{N}, \mathcal{S})$-
group decodable codes using their generator matrix. To do
this, we need the following two definitions.

Definition 7: Let $M = (m_{i,j})_{k \times n}$ be a binary matrix and
$G = (a_{i,j})_{k \times n}$ be a matrix over $\mathbb{F}$. We say that $G$ is supported
by $M$ if for all $i \in [k]$ and $j \in [n]$, $m_{i,j} = 0$ implies $a_{i,j} = 0$.
If $C$ is a linear code over $\mathbb{F}$ and has a generator matrix $G$
supported by $M$, we call $M$ a support generator matrix of $C$.

Definition 8: Let $M_0$ be a $k \times t$ binary matrix and $M$ be
a $k \times n$ binary matrix such that $C_{M_0}(j) = S_j$ for all $j \in [t]$
and $C_M(j) = S_i$ for all $i \in [t]$ and $j \in J_i$. We call $M_0$ the
incidence matrix of $\mathcal{S}$ and $M$ the indicator matrix of $(\mathcal{N}, \mathcal{S})$.

Remark 9: Since $\bigcup_{i=1}^{t} S_i = [k]$ and $C_{M_0}(i) = S_i$ for all
$i \in [t]$, then by Definition 8 each row of $M_0$ has at least one
1 and each column of $M_0$ has exactly $\alpha$ 1s. Moreover, by (5)
and Definition 8, $M$ is extended from $M_0$ by replicating each
column of $M_0$ $\beta$ times. Hence, each row of $M$ has at least
$\beta$ 1s and each column of $M$ has exactly $\alpha$ 1s.
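To make Definition 8 and Remark 9 concrete, here is a minimal sketch that builds the incidence matrix $M_0$ and the indicator matrix $M$ for the index sets of Fig. 2. The function names `incidence_matrix` and `indicator_matrix` are hypothetical, chosen only for this illustration.

```python
def incidence_matrix(subsets, k):
    """k x t matrix M0 whose j-th column has support S_j (elements are 1-based)."""
    return [[1 if i in S else 0 for S in subsets] for i in range(1, k + 1)]

def indicator_matrix(M0, beta):
    """k x (t*beta) matrix M obtained by repeating every column of M0 beta times."""
    return [[v for v in row for _ in range(beta)] for row in M0]

# Index sets of the (4, 6, 6, 3)-group decodable code of Fig. 2.
FIG2_SETS = [{1, 2, 3, 4}, {1, 2, 5, 6}, {3, 4, 5, 6}]

if __name__ == "__main__":
    M0 = incidence_matrix(FIG2_SETS, k=6)
    M = indicator_matrix(M0, beta=6)
    for row in M0:
        print(row)
    # Remark 9: every column of M has exactly alpha ones, every row at least beta ones.
    print([sum(col) for col in zip(*M)])
    print([sum(row) for row in M])
```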

Now, we can describe $(\mathcal{N}, \mathcal{S})$-group decodable codes using
their generator matrix.

Lemma 10: Let $M$ be the indicator matrix of $(\mathcal{N}, \mathcal{S})$. Then
$C$ is an $(\mathcal{N}, \mathcal{S})$-group decodable code if and only if $C$ has a
generator matrix $G$ satisfying the following two conditions:

(1) $G$ is supported by $M$;

(2) $\mathrm{rank}(G_J) = \alpha$ for each $i \in [t]$ and $J \subseteq J_i$ with $|J| = \alpha$.

Proof: This lemma can be directly derived from Definitions
1 and 8. ■

For any $[n, k]$ linear code $C$, the well-known Singleton
bound ([15, Ch1]) states that $d \le n - k + 1$. On the other hand,
we always have $d \ge 1$. So it must be that $d = n - k + 1 - \delta$ for
some $\delta \in \{0, 1, \cdots, n - k\}$. The following lemma describes
a useful fact about $d$ for any linear code [20].

Lemma 11: Let $C$ be an $[n, k, d]$ linear code and $G$ be a
generator matrix of $C$. Let $0 \le \delta \le n - k$. Then $d \ge n - k +
1 - \delta$ if and only if any $k + \delta$ columns of $G$ have rank $k$.
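Lemma 11 can be checked by brute force on a small code. The sketch below uses a binary $[7,4]$ Hamming code as the test case; this choice of code, its generator matrix `G`, and all helper names are assumptions made for illustration and do not come from the paper.

```python
from itertools import combinations, product

# Systematic generator matrix of a binary [7, 4, 3] Hamming code (illustrative choice).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def gf2_rank(vectors):
    """Rank over GF(2) of vectors given as integer bit masks (XOR-basis elimination)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears the leading bit of b in v when it is set
        if v:
            basis.append(v)
    return len(basis)

def min_distance(G):
    """Minimum Hamming weight over all nonzero codewords (exhaustive search)."""
    k, n = len(G), len(G[0])
    best = n
    for msg in product([0, 1], repeat=k):
        if any(msg):
            weight = sum(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))
            best = min(best, weight)
    return best

def all_column_sets_full_rank(G, size):
    """True if every choice of `size` columns of G has rank k."""
    k, n = len(G), len(G[0])
    cols = [sum(G[i][j] << i for i in range(k)) for j in range(n)]
    return all(gf2_rank([cols[j] for j in J]) == k
               for J in combinations(range(n), size))

def smallest_delta(G):
    """Smallest delta such that any k + delta columns of G have rank k."""
    k, n = len(G), len(G[0])
    return next(d for d in range(n - k + 1) if all_column_sets_full_rank(G, k + d))

if __name__ == "__main__":
    k, n = len(G), len(G[0])
    d, delta = min_distance(G), smallest_delta(G)
    print(d, delta, n - k + 1 - delta)
```

For this code the smallest admissible $\delta$ and the minimum distance satisfy $d = n - k + 1 - \delta$, as the lemma predicts.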

Using this lemma, we can give a bound on the minimum
distance of any linear code by its support generator matrix.

Lemma 12: Let $M = (m_{i,j})$ be a $k \times n$ binary matrix and
$0 \le \delta \le n - k$. The following three conditions are equivalent:

(1) There is an $[n, k]$ linear code $C$ over some field $\mathbb{F}$ such that
$M$ is a support generator matrix of $C$ and $d \ge n - k + 1 - \delta$.

(2) $\left|\bigcup_{j \in J} C_M(j)\right| \ge \ell$ for any $\ell \in [k]$ and any $J \subseteq [n]$ of
size $|J| = \ell + \delta$.

(3) $\left|\bigcup_{i \in I} R_M(i)\right| \ge n - k + |I| - \delta$ for all $\emptyset \ne I \subseteq [k]$.

Moreover, if condition (2) or (3) holds, there exists an $[n, k]$
linear code over the field of size $q > \binom{n-1}{k-1}$ with a support
generator matrix $M$ and minimum distance $d \ge n - k + 1 - \delta$.

Proof: The proof is given in Appendix A. ■ 
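Conditions (2) and (3) of Lemma 12 are purely combinatorial, so they can be tested exhaustively on a small support matrix. The sketch below does this for the indicator matrix of the $(4,6,6,3)$ example of Fig. 2; the helper names are hypothetical and the exhaustive search is only practical for small $n$ and $k$.

```python
from itertools import combinations

def popcount(x):
    return bin(x).count("1")

def indicator_matrix(subsets, k, beta):
    """k x (t*beta) binary matrix: bucket i contributes beta columns with support S_i."""
    return [[1 if i in S else 0 for S in subsets for _ in range(beta)]
            for i in range(1, k + 1)]

def condition_2(M, delta):
    """|union of C_M(j), j in J| >= l for every l in [k] and every J of size l + delta."""
    k, n = len(M), len(M[0])
    cols = [sum(M[i][j] << i for i in range(k)) for j in range(n)]
    for ell in range(1, k + 1):
        for J in combinations(cols, ell + delta):
            union = 0
            for c in J:
                union |= c
            if popcount(union) < ell:
                return False
    return True

def condition_3(M, delta):
    """|union of R_M(i), i in I| >= n - k + |I| - delta for every nonempty I."""
    k, n = len(M), len(M[0])
    rows = [sum(M[i][j] << j for j in range(n)) for i in range(k)]
    for size in range(1, k + 1):
        for I in combinations(rows, size):
            union = 0
            for r in I:
                union |= r
            if popcount(union) < n - k + size - delta:
                return False
    return True

FIG2_SETS = [{1, 2, 3, 4}, {1, 2, 5, 6}, {3, 4, 5, 6}]

if __name__ == "__main__":
    M = indicator_matrix(FIG2_SETS, k=6, beta=6)
    for delta in (1, 2):
        print(delta, condition_2(M, delta), condition_3(M, delta))
```

For this matrix both conditions fail for $\delta = 1$ and hold for $\delta = 2$, which corresponds to $d \ge n - k + 1 - \delta = 11$.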

For $(\mathcal{N}, \mathcal{S})$-group decodable codes, we have the following
two lemmas.

Lemma 13: Suppose $M$ is the indicator matrix of $(\mathcal{N}, \mathcal{S})$.
If $M$ satisfies condition (2) of Lemma 12, there exists an
$(\mathcal{N}, \mathcal{S})$-group decodable code over the field of size $q > \binom{n-1}{k-1}$
with minimum distance $d \ge n - k + 1 - \delta$.

Proof: The proof is given in Appendix B. ■

Lemma 14: Let $M_0 = (m_{i,j})$ be the incidence matrix of $\mathcal{S}$.
For any $(\mathcal{N}, \mathcal{S})$-group decodable code $C$, we have

\[ d \le w_{\min}(M_0)\beta - \Gamma(M_0) + 1. \tag{8} \]

Moreover, there exists an $(\mathcal{N}, \mathcal{S})$-group decodable code over
the field of size $q > \binom{n-1}{k-1}$ with $d = w_{\min}(M_0)\beta - \Gamma(M_0) + 1$.

Proof: The proof is given in Appendix C. ■

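Now, we can prove Theorem 4.

Proof of Theorem 4: Suppose $C$ is an $(\alpha, \beta, k, t)$-group
decodable code. By Definition 2, $C$ is an $(\mathcal{N}, \mathcal{S})$-group
decodable code for some $\mathcal{S} = \{S_1, \cdots, S_t\}$ and $\mathcal{N} = \{n_1, \cdots, n_t\}$
such that $S_i \subseteq [k]$, $|S_i| = \alpha$ and $n_i = \beta$ for all $i \in [t]$. Let $M_0$
be the incidence matrix of $\mathcal{S}$. By Lemma 14, it is sufficient
to prove

\[ w_{\min}(M_0)\beta - \Gamma(M_0) + 1 \le s\beta - \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil + 1. \]

By Remark 9, each column of $M_0$ has exactly $\alpha$ ones.
Then the total number of ones in $M_0$ is $N_{\mathrm{one}} = t\alpha$. On the
other hand, each row of $M_0$ has at least $w_{\min}(M_0)$ ones. So
$N_{\mathrm{one}} = t\alpha \ge k\, w_{\min}(M_0)$, which implies $w_{\min}(M_0) \le \frac{t\alpha}{k}$.
Since $w_{\min}(M_0)$ is an integer, we have

\[ w_{\min}(M_0) \le \left\lfloor \frac{t\alpha}{k} \right\rfloor = s. \tag{9} \]

Note that $\Gamma(M_0) \ge 1$. If $k - r \le \binom{t}{s}$, then we have
$\left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil = 1$, and (9) implies
$w_{\min}(M_0)\beta - \Gamma(M_0) + 1 \le s\beta = s\beta - \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil + 1$.
Thus, we only need to consider $k - r > \binom{t}{s}$. Again
by (9), we have the following two cases:

Case 1: $w_{\min}(M_0) = s$. Let $N_s$ be the number of rows of
$M_0$ with weight $s$. Then $M_0$ has $k - N_s$ rows with weight at
least $s + 1$. So the total number of ones in $M_0$ is $N_{\mathrm{one}} = t\alpha =
sk + r \ge sN_s + (s + 1)(k - N_s) = ks + (k - N_s)$. Thus,

\[ N_s \ge k - r. \tag{10} \]

If $\Gamma(M_0) < \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil$, then the repetition number of each row of
weight $w_{\min}(M_0) = s$ is at most $\left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil - 1$. Note that there
are at most $\binom{t}{s}$ binary vectors of length $t$ and weight $s$. Then
we have

\[ N_s \le \binom{t}{s}\left( \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil - 1 \right) < k - r, \]

which contradicts (10). So we have $\Gamma(M_0) \ge \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil$. Thus,
$w_{\min}(M_0)\beta - \Gamma(M_0) + 1 \le s\beta - \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil + 1$.

Case 2: $w_{\min}(M_0) \le s - 1$. Note that $t\alpha = sk + r \ge sk$
and $\alpha \le k$. Then we have $t \ge s$ and $\binom{t-1}{s-1} \ge 1$. Thus,

\[ k - r \le k \le k\binom{t-1}{s-1} \le \frac{sk + r}{s}\binom{t-1}{s-1}
   = \frac{t\alpha}{s}\binom{t-1}{s-1} = \alpha\binom{t}{s}. \]

So we have $\frac{k-r}{\binom{t}{s}} \le \alpha$, which implies that

\[ 1 \le \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil \le \alpha \le \beta. \]

Note that $\Gamma(M_0) \ge 1$. Then

\[ w_{\min}(M_0)\beta - \Gamma(M_0) + 1 \le w_{\min}(M_0)\beta \le (s - 1)\beta = s\beta - \beta
   \le s\beta - \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil + 1. \]

By the above discussion, we proved $w_{\min}(M_0)\beta - \Gamma(M_0) + 1 \le
s\beta - \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil + 1$. By Lemma 14,
$d \le s\beta - \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil + 1$. ■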

IV. Proof of Theorem 5

In this section, we prove Theorem 5. We first give a lemma
that will be used in our following discussion.

Lemma 15: Suppose $t\alpha = sk + r$, where $s \ge 1$ and $0 \le
r \le k - 1$. If $k - r \le \binom{t}{s}$, then there exists a $k \times t$ binary
matrix $M_0 = (m_{i,j})$ such that: (i) Each column of $M_0$ has
exactly $\alpha$ 1s; (ii) $w_{\min}(M_0) = s$ and $\Gamma(M_0) = 1$.

Proof: The proof is given in Appendix D. ■

Now we can prove Theorem 5.



































Proof of Theorem 5: By Lemma 14, it is sufficient to
construct a $k \times t$ binary matrix $M_0$ such that each column has
exactly $\alpha$ 1s, $w_{\min}(M_0) = s$ and $\Gamma(M_0) = \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil$. We have
the following two cases:

Case 1: $k - r \le \binom{t}{s}$. Then $\left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil = 1$ and $M_0$ can be
constructed by Lemma 15.

Case 2: $k - r > \binom{t}{s}$. In this case, we can assume

\[ k - r = u\binom{t}{s} + v, \tag{11} \]

where $u \ge 1$ and $0 \le v \le \binom{t}{s} - 1$. Since $t\alpha = sk + r$, then

\[ t\left( \alpha - u\binom{t-1}{s-1} \right) = t\alpha - tu\binom{t-1}{s-1}
   = t\alpha - su\binom{t}{s}
   = t\alpha - s(k - r - v)
   = (t\alpha - sk) + s(r + v)
   = r + s(r + v). \tag{12} \]

Let $M_1$ be a $u\binom{t}{s} \times t$ binary matrix such that each binary vector
of length $t$ and weight $s$ appears exactly $u$ times as a row. Then
each column of $M_1$ has exactly $u\binom{t-1}{s-1}$ 1s. We can further
construct an $(r + v) \times t$ matrix $M_2$ and let
$M_0 = \begin{pmatrix} M_1 \\ M_2 \end{pmatrix}$. To do
so, we need to consider the following two sub-cases:

Case 2.1: $v = 0$. By (12), $t\left( \alpha - u\binom{t-1}{s-1} \right) = (s + 1)r$. It is
easy to construct an $r \times t$ binary matrix $M_2$ such that each
column has exactly $\alpha - u\binom{t-1}{s-1}$ 1s and each row has exactly
$s + 1$ 1s. Let $M_0 = \begin{pmatrix} M_1 \\ M_2 \end{pmatrix}$. Then $M_0$ is a $k \times t$ binary matrix and
each column has exactly $\alpha$ 1s. Moreover, by the construction,
we have $w_{\min}(M_0) = s$ and $\Gamma(M_0) = u = \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil$.

Case 2.2: $v \ne 0$. Then $0 \le r \le v + r - 1$. Note that
$0 < v \le \binom{t}{s} - 1$ and by (12), $t\left( \alpha - u\binom{t-1}{s-1} \right) = s(r + v) + r$.
By the same discussion as in Lemma 15, we can construct an
$(r + v) \times t$ binary matrix $M_2$ such that: (i) Each column of $M_2$
has exactly $\alpha - u\binom{t-1}{s-1}$ 1s; (ii) $w_{\min}(M_2) = s$ and $\Gamma(M_2) = 1$.
Let $M_0 = \begin{pmatrix} M_1 \\ M_2 \end{pmatrix}$. Then $M_0$ is a $k \times t$ binary matrix and each
column has exactly $\alpha$ 1s. Moreover, by the construction, we
have $w_{\min}(M_0) = s$ and $\Gamma(M_0) = u + 1 = \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil$.

Thus, we can always construct a $k \times t$ binary matrix $M_0$
such that each column has exactly $\alpha$ 1s, $w_{\min}(M_0) = s$ and
$\Gamma(M_0) = \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil$. By Lemma 14, there exists an $(\mathcal{N}, \mathcal{S})$-group
decodable code over the field of size $q > \binom{n-1}{k-1}$ with

\[ d = w_{\min}(M_0)\beta - \Gamma(M_0) + 1 = s\beta - \left\lceil \frac{k-r}{\binom{t}{s}} \right\rceil + 1. \]

■

V. Conclusions

We introduce a new family of erasure codes, called group
decodable codes (GDC), for distributed storage systems that
are both locally repairable and group decodable. Thus, such
codes can be viewed as a subclass of locally repairable codes
(LRC). We derive an upper bound on the minimum distance
of such codes and prove that the bound is achievable for all
possible code parameters. However, since GDC is a subclass
of LRC, the minimum distance bound of GDC is smaller than
the minimum distance bound of LRC in general.

References
Appendix A
Proof of Lemma 12

The proof consists of three steps: In the first step, we prove
condition (1) implies condition (2); In the second step, we
prove condition (2) implies condition (3); In the third step,
we prove that if condition (3) holds, then there exists an $[n, k]$
linear code over the field of size $q > \binom{n-1}{k-1}$ with a support
generator matrix $M$ and minimum distance $d \ge n - k + 1 - \delta$.

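Proof of Lemma 12: (1) $\Rightarrow$ (2). Suppose condition (1)
holds. Let $G = (a_{i,j})$ be a generator matrix of $C$ supported
by $M$. Then for any $i \in [k]$ and $j \in [n]$, $m_{i,j} = 0$ implies
$a_{i,j} = 0$. Given any $\ell \in [k]$, since any $k + \delta$ columns of $G$ have
rank $k$ (Lemma 11), any $\ell + \delta$ columns of $G$ have rank at
least $\ell$ (adding $k - \ell$ further columns raises the rank by at most
$k - \ell$), i.e., $\mathrm{rank}(G_J) \ge \ell$ for any $J \subseteq [n]$ of size $|J| = \ell + \delta$.
So $G_J$ has at most $k - \ell$ rows that are all zeros, which implies
$\left|\bigcup_{j \in J} C_M(j)\right| \ge \ell$.

(2) $\Rightarrow$ (3). We can prove this by contradiction. Suppose
$\emptyset \ne I \subseteq [k]$ and $\left|\bigcup_{i \in I} R_M(i)\right| < n - k + |I| - \delta$. Let $J' =
[n] \setminus \bigcup_{i \in I} R_M(i)$. Then $|J'| > k - |I| + \delta$ and $m_{i,j} = 0$ for
all $i \in I$ and $j \in J'$. Let $\ell = k - |I| + 1$ and $J \subseteq J'$ such that
$|J| = \ell + \delta$. Then $\bigcup_{j \in J} C_M(j) \subseteq [k] \setminus I$. So $\left|\bigcup_{j \in J} C_M(j)\right| \le
k - |I| = \ell - 1$, which contradicts condition (2). Thus, it
must be that $\left|\bigcup_{i \in I} R_M(i)\right| \ge n - k + |I| - \delta$.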

(3) $\Rightarrow$ (1). The key is to construct a $k \times n$ matrix $G$ over a
field $\mathbb{F}$ of size $q > \binom{n-1}{k-1}$ such that $G$ is supported by $M$ and
any $k + \delta$ columns of $G$ have rank $k$.

Let $X = (x_{i,j})_{k \times n}$ such that $x_{i,j}$ is an indeterminate if
$m_{i,j} = 1$ and $x_{i,j} = 0$ if $m_{i,j} = 0$. Let $f(\cdots, x_{i,j}, \cdots) =
\prod_P \det(P)$, where the product is taken over all $k \times k$
submatrices $P$ of $X$ with $\det(P) \not\equiv 0$. Note that each $x_{i,j}$
belongs to at most $\binom{n-1}{k-1}$ submatrices $P$ and has degree at most
1 in each $\det(P)$. Then $x_{i,j}$ has degree at most $\binom{n-1}{k-1}$ in
$f(\cdots, x_{i,j}, \cdots)$. Note that $f(\cdots, x_{i,j}, \cdots) = \prod_P \det(P) \not\equiv 0$.
By [14, Lemma 4], if $|\mathbb{F}| > \binom{n-1}{k-1}$, then there exist $a_{i,j} \in \mathbb{F}$
(for $i, j$ where $m_{i,j} = 1$) such that $f(\cdots, a_{i,j}, \cdots) \ne 0$. Let
$G = (a_{i,j})$ (for $i, j$ where $m_{i,j} = 0$, we set $a_{i,j} = 0$). Then
$G$ is supported by $M$. We will prove $\mathrm{rank}(G_J) = k$ for any
$J \subseteq [n]$ with $|J| = k + \delta$. By construction of $G$, it is sufficient
to prove $\det(X_{J_0}) \not\equiv 0$ for some $J_0 \subseteq J$ with $|J_0| = k$.

Let $\mathcal{G}_J$ be the bipartite graph with vertex set $U \cup V$, where
$U = \{u_i ; i \in [k]\}$, $V = \{v_j ; j \in J\}$ and $U \cap V = \emptyset$,
such that $(u_i, v_j)$ is an edge of $\mathcal{G}_J$ if and only if $m_{i,j} = 1$.
Then for each $u_i \in U$, the set of all neighbors of $u_i$ is
$N(u_i) = \{v_j ; j \in R_M(i) \cap J\}$. So for all $I \subseteq [k]$, the
set of all neighbors of the vertices in $S = \{u_i ; i \in I\}$
is $N(S) = \{v_j ; j \in (\bigcup_{i \in I} R_M(i)) \cap J\}$. By assumption,
$\left|\bigcup_{i \in I} R_M(i)\right| \ge n - k + |I| - \delta$ and $|J| = k + \delta$. So
we have

\[ |N(S)| = \Bigl|\bigl(\textstyle\bigcup_{i \in I} R_M(i)\bigr) \cap J\Bigr|
   \ge \Bigl|\textstyle\bigcup_{i \in I} R_M(i)\Bigr| - \bigl|[n] \setminus J\bigr|
   \ge (n - k + |I| - \delta) - (n - k - \delta) = |I| = |S|. \]

By Hall's Theorem ([16, p. 419]),
$\mathcal{G}_J$ has a matching which covers every vertex in $U$. Let
$\mathcal{M} = \{(u_1, v_{\ell_1}), \cdots, (u_k, v_{\ell_k})\}$ be such a matching and
$J_0 = \{\ell_1, \cdots, \ell_k\}$. Let $\mathcal{G}_{J_0}$ be the subgraph of $\mathcal{G}_J$ generated
by $U \cup \{v_j ; j \in J_0\}$. Then $\mathcal{M}$ is a perfect matching of $\mathcal{G}_{J_0}$
and $X_{J_0}$ is the Edmonds matrix of $\mathcal{G}_{J_0}$. It is well known ([17,
p. 167]) that a bipartite graph has a perfect matching if and
only if the determinant of its Edmonds matrix is not identically
zero. Hence $\det(X_{J_0}) \not\equiv 0$.

By the construction of $G$, we have $\det(G_{J_0}) \ne 0$ and
$\mathrm{rank}(G_J) = k$, where $J$ is any subset of $[n]$ with $|J| = k + \delta$.
Let $C$ be the $[n, k]$ linear code generated by $G$. By Lemma
11, $d \ge n - k + 1 - \delta$. Note that we have proved that $G$ is
supported by $M$. So $M$ is a support generator matrix of $C$. ■

Appendix B 
Proof of Lemma 13

Proof of Lemma 13: Let $G$ and $C$ be constructed as in the
proof of Lemma 12. We will prove that $C$ is an $(\mathcal{N}, \mathcal{S})$-group
decodable code.

By Lemma 10, we need to prove $\mathrm{rank}(G_J) = \alpha$ for each
$i \in [t]$ and each $J \subseteq J_i$ of size $|J| = \alpha$. To prove this, it
is sufficient to construct a subset $J_0 \subseteq [n]$ such that $J \subseteq J_0$
and $\mathrm{rank}(G_{J_0}) = k$. To simplify notation, without loss of
generality, we can assume $J \subseteq J_1$, where $J_1$ is defined by (5).
Since $\bigcup_{i=1}^{t} S_i = [k]$, we can always find a collection $\mathcal{S}' \subseteq \mathcal{S}$
(by proper naming, we can assume $\mathcal{S}' = \{S_1, S_2, \cdots, S_r\}$)
such that $\bigcup_{\ell=1}^{r} S_\ell = [k]$ and
$I_\ell = S_\ell \setminus \bigcup_{\ell'=1}^{\ell-1} S_{\ell'} \ne \emptyset$, $\ell = 2, \cdots, r$.
Then $\{I_1, I_2, \cdots, I_r\}$ is a partition of $[k]$, where
$I_1 = S_1$. Let $J'_1 = J$ and for each $\ell \in \{2, \cdots, r\}$, pick
a $J'_\ell \subseteq J_\ell$ with $|J'_\ell| = |I_\ell|$. Let $J_0 = J'_1 \cup J'_2 \cup \cdots \cup J'_r$.
Then $|J_0| = k$. Let $\mathcal{G}_{J_0}$ be the bipartite graph with vertex
set $U \cup V$, where $U = \{u_i ; i \in [k]\}$, $V = \{v_j ; j \in J_0\}$ and
$U \cap V = \emptyset$, such that $(u_i, v_j)$ is an edge of $\mathcal{G}_{J_0}$ if and only if
$m_{i,j} = 1$. By Definition 8, $m_{i,j} = 1$ for each $i \in I_\ell$, $j \in J'_\ell$
and $\ell \in [r]$. So each subgraph $\mathcal{G}_{I_\ell, J'_\ell}$ is a complete bipartite
graph and has a perfect matching, where $\mathcal{G}_{I_\ell, J'_\ell}$ is generated
by $\{u_i ; i \in I_\ell\} \cup \{v_j ; j \in J'_\ell\}$. So the bipartite graph $\mathcal{G}_{J_0}$ has
a perfect matching. By a similar discussion as in the proof of
Lemma 12, $\mathrm{rank}(G_{J_0}) = k$. So $\mathrm{rank}(G_J) = \alpha$.

Moreover, by the proof of Lemma 12, $G$ is supported by
$M$ and is a generator matrix of $C$. So by Lemma 10, $C$ is an
$(\mathcal{N}, \mathcal{S})$-group decodable code. By Lemma 12, $d \ge n - k + 1 - \delta$.
So $C$ is a code that satisfies our requirements. ■

As an example, let $M_0$ be the matrix $A$ in Example 6.
Then $\alpha = 3$, $k = 7$ and $t = 8$. Let $\beta = 5$. Then $M$ is
obtained from $M_0$ by replicating each column of $M_0$ 5
times. By (5), $J_1 = \{1, \cdots, 5\}, \cdots, J_8 = \{36, \cdots, 40\}$. Let
$J = \{1, 3, 5\} \subseteq J_1$. We have $I_1 = S_1 = \{1, 4, 6\}$, $I_2 =
S_2 \setminus S_1 = \{2, 5, 7\}$ and $I_3 = S_3 \setminus (S_1 \cup S_2) = \{3\}$. Moreover,
we can pick $J'_1 = J$, $J'_2 = \{6, 7, 8\}$ and $J'_3 = \{11\}$. Then $G_{J_0}$
is of the following form:

* * * 0 0 0 *
0 0 0 * * * 0
0 0 0 0 0 0 *
* * * 0 0 0 0
0 0 0 * * * 0
* * * 0 0 0 *
0 0 0 * * * 0

where stars denote the nonzero entries of $G_{J_0}$. Clearly,
$\{(1,1), (2,4), (3,7), (4,2), (5,5), (6,3), (7,6)\}$ is a perfect
matching of the corresponding bipartite graph $\mathcal{G}_{J_0}$ (each pair
is a row index followed by the position of a starred column). By
construction of $G$, we have $\det(G_{J_0}) \ne 0$ and $\mathrm{rank}(G_{J_0}) = k = 7$.
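The existence of a perfect matching for the $7 \times 7$ star pattern of this example can be confirmed by exhaustive search. The sketch below is only an illustration; the name `perfect_matchings` is hypothetical, and each matching is reported as (row, column position) pairs, both counted from 1.

```python
from itertools import permutations

# Star pattern of G_{J0} in the example: '*' marks an allowed nonzero entry.
PATTERN = [
    "***000*",
    "000***0",
    "000000*",
    "***0000",
    "000***0",
    "***000*",
    "000***0",
]

def perfect_matchings(pattern):
    """All ways to pick one starred entry per row so that no column is used twice."""
    n = len(pattern)
    result = []
    for perm in permutations(range(n)):
        if all(pattern[i][perm[i]] == "*" for i in range(n)):
            result.append(tuple((i + 1, perm[i] + 1) for i in range(n)))
    return result

if __name__ == "__main__":
    matchings = perfect_matchings(PATTERN)
    print(len(matchings))
    print(matchings[0])
```

Row 3 has a single star, in the last column, so every perfect matching must contain the pair (3, 7); the remaining rows split into two complete $3 \times 3$ blocks.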




Appendix C 
Proof of Lemma HU 

Proof of Lemma 177} Let M be the indicator matrix of 
(Af, S). By Lemma ITOl M is a support generator matrix of C. 
Let So be the smallest number such that | \Jj^j Cm(J)\ — £ 
for all £ £ [A] and all J C [n] of size | J| = £ + Sq. Then by 
Lemma [12] d <n — A +1 — Sq. By Lemma [12] and [13] there 
exists an (jV, S )-group decodable code over the field F of size 
q > with d = n — k + l — 5o. Thus, to prove this lemma, 

the key is to prove that S 0 = n — w m i n (M 0 )/3 — k + T(M 0 ). 

By Definition [ 8 ] Mo is a k x t binary matrix and M is a 
k x n binary matrix such that Cm 0 (*) = Si for all i £ [t] and 
Cm(J) = Si for all i £ [t] and j £ Ji. For each i £ [n], let 


€m{£) 


min 

JC[n],| J\=t 


U CmU) 

j&J 


Then by definition of Sq, we have 


(13) 


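$$\delta_0 = \min\{\delta;\ 0 \le \delta \le n-k,\ \xi_M(\ell+\delta) \ge \ell,\ \forall \ell \in [k]\}. \tag{14}$$

For each $i \in [t]$, let

$$\xi_{M_0}(i) = \min_{J \subseteq [t],\, |J| = i} \Big| \bigcup_{j \in J} C_{M_0}(j) \Big|. \tag{15}$$

Then we have the following four claims:

Claim 1: $\xi_{M_0}(i_0) = k - T(M_0) < k = \xi_{M_0}(i_0+1) = \cdots = \xi_{M_0}(t)$, where $i_0 = t - w_{\min}(M_0)$.

Claim 2: For all $i \in [t]$ and $\ell \in J_i$, $\xi_M(\ell) = \xi_{M_0}(i)$.

Claim 3: $\ell' - \xi_M(\ell') \le i_0\beta - \xi_M(i_0\beta)$, $\forall \ell' \in [i_0\beta]$.

Claim 4: $\delta_0 = i_0\beta - \xi_{M_0}(i_0)$.

Note that $n = t\beta$. Then Claims 1 and 4 imply that $\delta_0 = n - w_{\min}(M_0)\beta - k + T(M_0)$, which completes the proof. ■

Proof of Claim 1: Suppose $J \subseteq [t]$ and $i_0 + 1 \le |J| \le t$. Then $\bigcup_{j \in J} C_{M_0}(j) = [k]$. Otherwise, there is an $\ell \in [k]$ such that $\ell \notin C_{M_0}(j)$ for all $j \in J$, which implies that $m_{\ell,j} = 0$ for all $j \in J$. So $R_{M_0}(\ell) \subseteq [t] \setminus J$ and $|R_{M_0}(\ell)| \le |[t] \setminus J| = t - |J| \le t - (i_0 + 1) = w_{\min}(M_0) - 1$, which contradicts the definition of $w_{\min}(M_0)$. Thus, we proved that $\bigcup_{j \in J} C_{M_0}(j) = [k]$. By (15), we have $\xi_{M_0}(i) = k$ for $i_0 + 1 \le i \le t$.

Now, suppose $J \subseteq [t]$ and $|J| = i_0 = t - w_{\min}(M_0)$. We have the following two cases:

Case 1: $J = [t] \setminus R_{M_0}(\ell)$ for some $\ell \in [k]$ such that $|R_{M_0}(\ell)| = w_{\min}(M_0)$. Then $\big|\bigcup_{j \in J} C_{M_0}(j)\big| = k - T_{M_0}(\ell)$.

This can be proved as follows:

For each $\ell' \in [k]$ such that $R_{M_0}(\ell') = R_{M_0}(\ell)$, we have $m_{\ell',j} = m_{\ell,j} = 0$ for all $j \in J$. Thus, $\ell' \notin \bigcup_{j \in J} C_{M_0}(j)$.

For each $\ell' \in [k]$ such that $R_{M_0}(\ell') \ne R_{M_0}(\ell)$, since $|R_{M_0}(\ell)| = w_{\min}(M_0)$, we have $R_{M_0}(\ell') \nsubseteq R_{M_0}(\ell)$. Note that $J = [t] \setminus R_{M_0}(\ell)$. Then $R_{M_0}(\ell') \cap J \ne \emptyset$ and $m_{\ell',j} \ne 0$ for some $j \in J$. So $\ell' \in C_{M_0}(j)$ and $\ell' \in \bigcup_{j \in J} C_{M_0}(j)$.

Thus, for each $\ell' \in [k]$, $\ell' \notin \bigcup_{j \in J} C_{M_0}(j)$ if and only if $R_{M_0}(\ell') = R_{M_0}(\ell)$. So $\big|\bigcup_{j \in J} C_{M_0}(j)\big| = k - T_{M_0}(\ell)$.

Case 2: $J \ne [t] \setminus R_{M_0}(\ell)$ for all $\ell \in [k]$ such that $|R_{M_0}(\ell)| = w_{\min}(M_0)$. Then $\big|\bigcup_{j \in J} C_{M_0}(j)\big| = k$. Otherwise, there is an $\ell' \in [k]$ such that $\ell' \notin C_{M_0}(j)$ for all $j \in J$, which implies that $m_{\ell',j} = 0$ for all $j \in J$, and hence $R_{M_0}(\ell') \subseteq [t] \setminus J$. Note that $|J| = t - w_{\min}(M_0)$. Then $|R_{M_0}(\ell')| \le |[t] \setminus J| = t - |J| = w_{\min}(M_0)$. Thus, $|R_{M_0}(\ell')| = w_{\min}(M_0) = t - |J|$ and $J = [t] \setminus R_{M_0}(\ell')$, which contradicts the assumption on $J$.

By the above discussion, we proved that for each $J \subseteq [t]$ of size $|J| = i_0$, either $\big|\bigcup_{j \in J} C_{M_0}(j)\big| = k - T_{M_0}(\ell)$ for some $\ell \in [k]$ with $|R_{M_0}(\ell)| = w_{\min}(M_0)$, or $\big|\bigcup_{j \in J} C_{M_0}(j)\big| = k$. Thus, by the definitions of $\xi_{M_0}$ and $T(M_0)$, $\xi_{M_0}(i_0) = k - T(M_0)$. ■

Proof of Claim 2: From Definition 8 we have

$$\bigcup_{j \in J} C_M(j) = \bigcup_{i' :\, J \cap J_{i'} \ne \emptyset} S_{i'}, \quad \forall J \subseteq [n]. \tag{16}$$

Firstly, we prove $\big|\bigcup_{j \in J} C_M(j)\big| \ge \xi_{M_0}(i)$ for each $J \subseteq [n]$ of size $|J| = \ell$.

By the definition of $J_i$, we have $(i-1)\beta + 1 \le |J| \le i\beta$. Note that, by the same definition, $|J_{i'}| = \beta$ for each $i' \in [t]$. Then the number of $i'$ such that $J \cap J_{i'} \ne \emptyset$ is at least $i$. By (16) and (15), we have

$$\Big|\bigcup_{j \in J} C_M(j)\Big| = \Big|\bigcup_{i' :\, J \cap J_{i'} \ne \emptyset} S_{i'}\Big| = \Big|\bigcup_{i' :\, J \cap J_{i'} \ne \emptyset} C_{M_0}(i')\Big| \ge \xi_{M_0}(i).$$

The second equation holds because, by Definition 8, $C_{M_0}(i') = S_{i'}$ for each $i' \in [t]$. So by (13), we have $\xi_M(\ell) \ge \xi_{M_0}(i)$.

Secondly, we prove that there exists a $J \subseteq [n]$ of size $|J| = \ell$ such that $\big|\bigcup_{j \in J} C_M(j)\big| = \xi_{M_0}(i)$.

By (15), there is a $\{j_1, \cdots, j_i\} \subseteq [t]$ such that

$$\xi_{M_0}(i) = \Big|\bigcup_{\lambda=1}^{i} C_{M_0}(j_\lambda)\Big| = \Big|\bigcup_{\lambda=1}^{i} S_{j_\lambda}\Big|. \tag{17}$$

Since $\ell \in J_i$, by the definition of $J_i$ we have $\big|\bigcup_{\lambda=1}^{i-1} J_{j_\lambda}\big| = (i-1)\beta < \ell \le \big|\bigcup_{\lambda=1}^{i} J_{j_\lambda}\big| = i\beta$. So we can always find a subset $J \subseteq [n]$ such that $\bigcup_{\lambda=1}^{i-1} J_{j_\lambda} \subsetneq J \subseteq \bigcup_{\lambda=1}^{i} J_{j_\lambda}$ and $|J| = \ell$. Then by (16) and (17), we have

$$\Big|\bigcup_{j \in J} C_M(j)\Big| = \Big|\bigcup_{i' :\, J \cap J_{i'} \ne \emptyset} S_{i'}\Big| = \Big|\bigcup_{\lambda=1}^{i} S_{j_\lambda}\Big| = \xi_{M_0}(i).$$

The above discussion implies that $\xi_{M_0}(i) = \min_{J \subseteq [n],\, |J| = \ell} \big|\bigcup_{j \in J} C_M(j)\big|$. By (13), we have $\xi_M(\ell) = \xi_{M_0}(i)$. ■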
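As a sanity check of Claim 2, the following minimal sketch computes $\xi_M$ and $\xi_{M_0}$ by brute force on a small instance. It assumes, as the proof uses, that $M$ is obtained from $M_0$ by placing a copy of column $i'$ of $M_0$ in each of the $\beta$ positions of $J_{i'}$. The function names and the example matrix are illustrative and are not taken from the paper.

```python
from itertools import combinations
from math import ceil

def xi(M, size):
    """Minimum, over column sets J with |J| = size, of the size of the union of column supports."""
    n_cols = len(M[0])
    supports = [{i for i, row in enumerate(M) if row[j]} for j in range(n_cols)]
    return min(len(set().union(*(supports[j] for j in J)))
               for J in combinations(range(n_cols), size))

def expand(M0, beta):
    """Assumed form of M: repeat every column of M0 beta times (block J_i holds copies of column i)."""
    return [[x for x in row for _ in range(beta)] for row in M0]

# Hypothetical example: k = 3, t = 4, every column of M0 has weight alpha = 2, and beta = 3.
M0 = [[1, 1, 0, 0],
      [1, 0, 1, 1],
      [0, 1, 1, 1]]
beta = 3
M = expand(M0, beta)

# Claim 2: xi_M(l) = xi_{M0}(i) whenever l lies in J_i, i.e. i = ceil(l / beta).
claim2_holds = all(xi(M, l) == xi(M0, ceil(l / beta))
                   for l in range(1, len(M[0]) + 1))
print(claim2_holds)
```

The search is exponential in the number of columns, so it is only meant for toy parameters.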


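Proof of Claim 3: We first prove

$$i\beta - \xi_M(i\beta) \le i_0\beta - \xi_M(i_0\beta), \quad \forall i \in [i_0]. \tag{18}$$

For each $i \in \{1, 2, \cdots, t-1\}$, by (15), there exists a $J' \subseteq [t]$ of size $|J'| = i$ such that

$$\xi_{M_0}(i) = \Big|\bigcup_{j \in J'} C_{M_0}(j)\Big|.$$

Pick a $j_0 \in [t] \setminus J'$ and let $J = J' \cup \{j_0\}$. Then by (15),

$$\xi_{M_0}(i+1) \le \Big|\bigcup_{j \in J} C_{M_0}(j)\Big|.$$

The above two equations imply that

$$\xi_{M_0}(i+1) - \xi_{M_0}(i) \le \Big|\bigcup_{j \in J} C_{M_0}(j)\Big| - \Big|\bigcup_{j \in J'} C_{M_0}(j)\Big| \le |C_{M_0}(j_0)| = |S_{j_0}| = \alpha \le \beta.$$

Combining this with Claim 2, we have

$$i\beta - \xi_M(i\beta) = i\beta - \xi_{M_0}(i) \le (i+1)\beta - \xi_{M_0}(i+1) = (i+1)\beta - \xi_M((i+1)\beta).$$

By induction, we have

$$\beta - \xi_M(\beta) \le 2\beta - \xi_M(2\beta) \le \cdots \le i_0\beta - \xi_M(i_0\beta),$$

which proves (18).

Now, we can prove Claim 3. Let $i \in [i_0]$ and $\ell' \in J_i$. Since, by the definition of $J_i$, $(i-1)\beta + 1 \le \ell' \le i\beta$, and by Claim 2, $\xi_M(\ell') = \xi_{M_0}(i) = \xi_M(i\beta)$, then

$$\ell' - \xi_M(\ell') \le i\beta - \xi_M(i\beta).$$

Combining this with (18), we have

$$\ell' - \xi_M(\ell') \le i_0\beta - \xi_M(i_0\beta).$$

Note that $[i_0\beta] = \{1, 2, \cdots, i_0\beta\} = J_1 \cup J_2 \cup \cdots \cup J_{i_0}$. Thus, $\ell' - \xi_M(\ell') \le i_0\beta - \xi_M(i_0\beta)$, $\forall \ell' \in [i_0\beta]$. ■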
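For illustration, consider a hypothetical instance (not taken from the paper) with $k = 3$, $t = 4$, $\alpha = 2$, $\beta = 3$ and column supports $S_1 = \{1,2\}$, $S_2 = \{1,3\}$, $S_3 = S_4 = \{2,3\}$. Row 1 of $M_0$ has weight 2 and rows 2 and 3 have weight 3, so $w_{\min}(M_0) = 2$ and $i_0 = 2$. The values below are worked out by hand from (15) and Claim 2.

```latex
\begin{align*}
\xi_{M_0}(1) &= |S_1| = 2, &
\xi_{M_0}(2) &= |S_3 \cup S_4| = 2, \\
\xi_{M_0}(2) - \xi_{M_0}(1) &= 0 \le \alpha = 2, \\
\beta - \xi_M(\beta) &= 3 - \xi_{M_0}(1) = 1, &
2\beta - \xi_M(2\beta) &= 6 - \xi_{M_0}(2) = 4, \\
\ell' - \xi_M(\ell') &= \ell' - 2 \le 4 = i_0\beta - \xi_M(i_0\beta), &
&\forall \ell' \in [i_0\beta] = \{1, \dots, 6\}.
\end{align*}
```

Here (18) reads $1 \le 4$, and the bound of Claim 3 is attained only at $\ell' = i_0\beta = 6$.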

Proof of Claim 4: Denote $\delta'_0 = i_0\beta - \xi_{M_0}(i_0)$. We need to prove $\delta_0 = \delta'_0$. Since by Claim 2, $\xi_M(i_0\beta) = \xi_{M_0}(i_0)$, we have $\delta'_0 = i_0\beta - \xi_M(i_0\beta)$.

Firstly, we prove $\xi_M(\ell + \delta'_0) \ge \ell$ for all $\ell \in [k]$.

Suppose $\ell \in [k]$. If $\ell + \delta'_0 \ge i_0\beta + 1$, then by the definition of $J_i$, $\ell + \delta'_0 \in J_i$ for some $i \in \{i_0 + 1, \cdots, t\}$. By Claims 1 and 2, $\xi_M(\ell + \delta'_0) = \xi_{M_0}(i) = k \ge \ell$. If $\ell + \delta'_0 \le i_0\beta$, then by Claim 3, $(\ell + \delta'_0) - \xi_M(\ell + \delta'_0) \le i_0\beta - \xi_M(i_0\beta) = \delta'_0$. So $\xi_M(\ell + \delta'_0) \ge \ell$. Thus, $\xi_M(\ell + \delta'_0) \ge \ell$ for all $\ell \in [k]$.

Secondly, we prove that if $\delta' < \delta'_0$, then $\xi_M(\ell + \delta') < \ell$ for some $\ell \in [k]$. We can prove this by contradiction.

Suppose $\xi_M(\ell + \delta') \ge \ell$ for all $\ell \in [k]$. We have the following two cases:

Case 1: $i_0\beta - \delta' \in [k]$. Note that $i_0\beta - \xi_M(i_0\beta) = \delta'_0 > \delta'$. Then $\xi_M(i_0\beta) < i_0\beta - \delta'$. Let $\ell = i_0\beta - \delta'$. Then $\ell \in [k]$ and $\xi_M(\ell + \delta') < \ell$, which contradicts the assumption.

Case 2: $i_0\beta - \delta' \notin [k]$. Since $i_0\beta - \xi_M(i_0\beta) = \delta'_0 > \delta'$, then $i_0\beta - \delta' > i_0\beta - \delta'_0 = \xi_M(i_0\beta) \ge 0$. So we have $i_0\beta - \delta' > k$ and $i_0\beta > k + \delta'$. By (13) and the assumption, we have

$$\xi_M(i_0\beta) \ge \xi_M(k + \delta') \ge k.$$

By Claim 2, $\xi_M(i_0\beta) = \xi_{M_0}(i_0)$. Then the above equation implies $\xi_M(i_0\beta) = \xi_{M_0}(i_0) \ge k$, which contradicts Claim 1.

In both cases, we can derive a contradiction. Thus, we conclude that $\xi_M(\ell + \delta') < \ell$ for some $\ell \in [k]$.

The above discussion shows that $\delta'_0$ is the smallest number that satisfies the condition that $\xi_M(\ell + \delta'_0) \ge \ell$, $\forall \ell \in [k]$.

Thirdly, we prove $\delta'_0 \le n - k$.

Let $J_0$ and $\mathcal{G}_{J_0}$ be constructed as in the proof of Lemma 13 (we can denote $J_0 = \{j_1, j_2, \cdots, j_k\}$). Then $\mathcal{G}_{J_0}$ has a perfect matching. Thus, there exists a permutation $(i_1, i_2, \cdots, i_k)$ of $(1, 2, \cdots, k)$ such that $m_{i_\lambda, j_\lambda} = 1$ for all $\lambda \in [k]$, and we have $\big|\bigcup_{j \in J'} C_M(j)\big| \ge |J'|$ for all $J' \subseteq J_0$. Now, for any $\ell \in [k]$ and $J \subseteq [n]$ of size $|J| = \ell + n - k$, since $|J_0| = k$, we have $|J \cap J_0| \ge \ell$. So $\big|\bigcup_{j \in J} C_M(j)\big| \ge \big|\bigcup_{j \in J \cap J_0} C_M(j)\big| \ge |J \cap J_0| \ge \ell$. By (13), we have $\xi_M(\ell + n - k) \ge \ell$. Thus, we proved that $\delta' = n - k$ also satisfies the condition that

$$\xi_M(\ell + \delta') \ge \ell, \quad \forall \ell \in [k].$$

Note that $\delta'_0$ is the smallest number that satisfies the condition that $\xi_M(\ell + \delta'_0) \ge \ell$, $\forall \ell \in [k]$. So $\delta'_0 \le n - k$.

Finally, we prove $\delta'_0 \ge 0$.

By (15), there exists a $J \subseteq [t]$ such that $|J| = i_0$ and

$$\xi_{M_0}(i_0) = \Big|\bigcup_{j \in J} C_{M_0}(j)\Big| \le \sum_{j \in J} |C_{M_0}(j)| = \sum_{j \in J} |S_j| = i_0\alpha \le i_0\beta.$$

So by Claim 2, $i_0\beta - \xi_M(i_0\beta) = i_0\beta - \xi_{M_0}(i_0) \ge 0$.

Thus, we proved that $0 \le \delta'_0 \le n - k$ and $\delta'_0$ is the smallest number that satisfies the condition that $\xi_M(\ell + \delta'_0) \ge \ell$, $\forall \ell \in [k]$. By (14), we have $\delta_0 = \delta'_0 = i_0\beta - \xi_{M_0}(i_0)$. ■
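The closed form obtained from Claims 1 and 4, $\delta_0 = n - w_{\min}(M_0)\beta - k + T(M_0)$, can be checked numerically on small instances. The sketch below evaluates (14) by brute force and compares it with the closed form. It assumes that $M$ repeats each column of $M_0$ $\beta$ times, and that $T(M_0)$ is the largest number of identical rows among the minimum-weight rows of $M_0$, which is how the proof of Claim 1 uses it. All function names and both example matrices are illustrative, not the authors'.

```python
from itertools import combinations

def xi(M, size):
    """Minimum, over column sets J with |J| = size, of the size of the union of column supports."""
    n_cols = len(M[0])
    supports = [{i for i, row in enumerate(M) if row[j]} for j in range(n_cols)]
    return min(len(set().union(*(supports[j] for j in J)))
               for J in combinations(range(n_cols), size))

def expand(M0, beta):
    """Assumed form of M: every column of M0 repeated beta times."""
    return [[x for x in row for _ in range(beta)] for row in M0]

def delta0_brute_force(M):
    """Smallest delta in [0, n - k] with xi_M(l + delta) >= l for all l in [k], as in (14)."""
    k, n = len(M), len(M[0])
    for delta in range(n - k + 1):
        if all(xi(M, l + delta) >= l for l in range(1, k + 1)):
            return delta
    return None  # no admissible delta in the range

def delta0_closed_form(M0, beta):
    """n - w_min * beta - k + T(M0), with T(M0) read off from the minimum-weight rows."""
    k, t = len(M0), len(M0[0])
    row_supports = [frozenset(j for j, x in enumerate(row) if x) for row in M0]
    w_min = min(len(s) for s in row_supports)
    T = max(row_supports.count(s) for s in row_supports if len(s) == w_min)
    return t * beta - w_min * beta - k + T

examples = [
    ([[1, 1, 0, 0], [1, 0, 1, 1], [0, 1, 1, 1]], 3),  # distinct rows, T(M0) = 1
    ([[1, 0], [1, 0], [0, 1], [0, 1]], 3),            # repeated rows, T(M0) = 2
]
for M0, beta in examples:
    print(delta0_brute_force(expand(M0, beta)), delta0_closed_form(M0, beta))
```

Both examples have constant column weight $\alpha = 2 \le \beta$, matching the setting of the proof; the two printed numbers agree on each line.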


Appendix D
Proof of Lemma 15

Proof of Lemma 15: Since $k - r \le \binom{t}{s}$, we can construct a $k \times t$ binary matrix $M_0 = (m_{i,j})$ such that: 1) $R_{M_0}(i)$, $i = 1, \cdots, k - r$, are mutually different and $|R_{M_0}(i)| = s$; 2) $|R_{M_0}(i)| = s + 1$, $i = k - r + 1, \cdots, k$. Since $t\alpha = sk + r$ and $0 \le r \le k - 1$, the total number of ones in $M_0$ is

$$N_{\text{one}} = (k - r)s + r(s + 1) = ks + r = t\alpha.$$

Clearly, $M_0$ satisfies condition (ii). We can further modify $M_0$ properly so that it satisfies conditions (i) and (ii).

Suppose there is a $j_1 \in [t]$ such that $|C_{M_0}(j_1)| < \alpha$. Since the total number of ones in $M_0$ is $N_{\text{one}} = t\alpha$, there exists a $j_2 \in [t]$ such that $|C_{M_0}(j_2)| > \alpha$. We shall modify $M_0$ so that $|C_{M_0}(j_1)|$ increases by one and $|C_{M_0}(j_2)|$ decreases by one. To do this, let

$$I_1 = \{i;\ 1 \le i \le k - r,\ m_{i,j_1} = 1 \text{ and } m_{i,j_2} = 0\}$$

and

$$I_2 = \{i;\ 1 \le i \le k - r,\ m_{i,j_1} = 0 \text{ and } m_{i,j_2} = 1\}.$$

Then clearly, $I_1 \cap I_2 = \emptyset$ and $m_{i,j_1} = m_{i,j_2}$ for all $i \in \{1, \cdots, k - r\} \setminus (I_1 \cup I_2)$. We have the following two cases:

Case 1: There is an $i \in \{k - r + 1, \cdots, k\}$ such that $m_{i,j_1} = 0$, $m_{i,j_2} = 1$ and $|R_{M_0}(i)| = s + 1$. Then we modify $M_0$ by letting $m_{i,j_1} = 1$, $m_{i,j_2} = 0$. Then $|C_{M_0}(j_1)|$ increases by one and $|C_{M_0}(j_2)|$ decreases by one. Moreover, it is easy to see that $M_0$ still satisfies condition (ii).

Case 2: For all $i \in \{k - r + 1, \cdots, k\}$, $m_{i,j_2} = 1$ implies $m_{i,j_1} = 1$. Note that $|C_{M_0}(j_1)| < \alpha < |C_{M_0}(j_2)|$; then we have $|I_1| < |I_2|$. For each $\ell \in I_2$, we modify $M_0$ by letting $m_{\ell,j_1} = 1$, $m_{\ell,j_2} = 0$ while the other entries of $M_0$ remain unchanged. Denote the resulting matrix by $M_\ell$. Then $|C_{M_\ell}(j_1)|$ increases by one and $|C_{M_\ell}(j_2)|$ decreases by one. If there is an $\ell \in I_2$ such that $M_\ell$ does not satisfy condition (ii), it must be that $R_{M_\ell}(\ell) = R_{M_\ell}(\ell')$ for some $\ell' \in I_1$. Moreover, all such $\ell$ and $\ell'$ are in one-to-one correspondence. Note that $|I_1| < |I_2|$. Then there exists an $\ell_0 \in I_2$ such that $M_{\ell_0}$ satisfies condition (ii). So we can let $M_0$ be $M_{\ell_0}$.

We can perform the above operation repeatedly until each column of $M_0$ has weight $\alpha$. Thus, we obtain a matrix $M_0$ that satisfies conditions (i) and (ii). ■
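For very small parameters, the existence statement can be explored by exhaustive search. The sketch below does not implement the column-balancing procedure of the proof; it simply enumerates candidate row supports. It reads condition (i) as "every column has weight $\alpha$" and condition (ii) as "the first $k - r$ rows are mutually different rows of weight $s$", which is what the proof suggests; the precise conditions are those stated in the lemma. The function name `search_m0` and the parameter choice are hypothetical.

```python
from itertools import combinations, combinations_with_replacement

def search_m0(k, t, alpha):
    """Exhaustively search for a k x t binary matrix with the row and column weights used in the proof."""
    s, r = divmod(t * alpha, k)                   # t*alpha = s*k + r with 0 <= r <= k - 1
    light = list(combinations(range(t), s))       # candidate row supports of weight s
    heavy = list(combinations(range(t), s + 1))   # candidate row supports of weight s + 1
    for first in combinations(light, k - r):                  # k - r mutually different rows
        for rest in combinations_with_replacement(heavy, r):  # r rows of weight s + 1
            rows = list(first) + list(rest)
            col_weights = [sum(j in row for row in rows) for j in range(t)]
            if all(w == alpha for w in col_weights):
                return [[1 if j in row else 0 for j in range(t)] for row in rows]
    return None  # nothing found, e.g. when k - r exceeds the number of weight-s supports

# Hypothetical parameters: t*alpha = 8 = 2*3 + 2, so s = 2 and r = 2.
M0 = search_m0(k=3, t=4, alpha=2)
for row in M0:
    print(row)
```

The search space grows combinatorially in $t$ and $k$, so this is only useful for checking toy cases against the constructive argument above.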


